First named inventor: Bossemeyer Page 3 

Serial no. 10/698,629 
Filed 10/31/2003 

Attorney docket no. 1048.002US1 



In the claims 

1. (currently amended) A method comprising: 

receiving a signal representing digitized, sampled human speech; 
locating at least one speech segment within the signal; 

locating one or more higher energy sections within each speech segment within the signal; 

locating a plurality of glottal events within each speech segment within the signal, based 
on the one or more higher energy sections within each speech segment; and, 

confirming the plurality of glottal events located within each speech segment within the 
signal, including registering each of at least one of the plurality of glottal events with adjacent 
glottal events,

wherein confirming the plurality of glottal events located within each speech segment
comprises, for each adjacent pair of glottal events within each speech segment:

comparing a first glottal event and a second glottal event of the adjacent pair of
glottal events to determine a pair-wise distance between the first and the second glottal events;
and,

adjusting boundaries of at least one of the first glottal event and the second glottal
event to minimize the pair-wise distance between the first and the second glottal events,
maximizing similarity of the first and the second glottal events of the adjacent pair, such that
adjusting the boundaries of at least one of the first glottal event and the second glottal event
results in the pair-wise distance between the first and the second glottal events being minimized.

2. (original) The method of claim 1, wherein receiving the signal representing the digitized, 
sampled human speech comprises: 

recording human speech; and, 

sampling the human speech to digitize the human speech, yielding the signal. 

3. (original) The method of claim 1, wherein locating at least one speech segment within the
signal comprises determining a start point and an end point of each speech segment. 

4. (original) The method of claim 1, wherein locating at least one speech segment within the
signal comprises determining an energy within the signal and examining the energy for regions
above a threshold, such that each region above the threshold corresponds to a speech segment.
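
By way of illustration only, the energy-threshold approach recited in claim 4 might be sketched as follows. The fixed frame length, the threshold value, and the function names `frame_energy` and `find_speech_segments` are assumptions made for this sketch and are not recited in the claims.

```python
def frame_energy(samples, frame_len):
    """Sum of squared samples for each non-overlapping frame (assumed energy measure)."""
    return [
        sum(x * x for x in samples[i:i + frame_len])
        for i in range(0, len(samples), frame_len)
    ]


def find_speech_segments(samples, frame_len, threshold):
    """Return (start, end) sample indices (end exclusive) of regions above threshold."""
    segments = []
    start = None
    for idx, energy in enumerate(frame_energy(samples, frame_len)):
        if energy > threshold and start is None:
            start = idx * frame_len              # region rises above the threshold
        elif energy <= threshold and start is not None:
            segments.append((start, idx * frame_len))  # region falls back below
            start = None
    if start is not None:
        segments.append((start, len(samples)))   # region runs to the end of the signal
    return segments
```

Each returned pair gives a start point and an end point for one speech segment, in the sense of claim 3.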

5. (original) The method of claim 1, wherein locating the one or more higher energy sections 
within each speech segment comprises determining regions within each speech segment where an 
energy is at least a percentage of a peak energy within the speech segment. 
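
By way of illustration only, the percentage-of-peak test recited in claim 5 might be sketched as follows. The input is assumed to be a list of per-frame energies for a single speech segment, and the name `find_higher_energy_sections` and the `fraction` parameter are hypothetical.

```python
def find_higher_energy_sections(energies, fraction):
    """Return (start, end) index ranges (end exclusive) where energy >= fraction * peak.

    `energies` is assumed to hold per-frame energies of one speech segment;
    `fraction` is the assumed percentage expressed as a value between 0 and 1.
    """
    if not energies:
        return []
    cutoff = fraction * max(energies)
    sections = []
    start = None
    for idx, energy in enumerate(energies):
        if energy >= cutoff and start is None:
            start = idx
        elif energy < cutoff and start is not None:
            sections.append((start, idx))
            start = None
    if start is not None:
        sections.append((start, len(energies)))
    return sections
```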

6. (original) The method of claim 1, wherein locating the plurality of glottal events within
each speech segment comprises, for each speech segment: 

subjecting each higher energy section within the speech segment to a linear predictive 
coefficient (LPC) analysis, yielding a LPC residual error signal for each higher energy section; 

locating a number of largest peaks within the LPC residual error signal for each higher 
energy section that have a minimum separation between adjacent of the peaks; and, 

locating the plurality of glottal events within the speech segment as corresponding to the 
number of largest peaks within the LPC residual error signal that have the minimum separation. 

7. (original) The method of claim 6, wherein subjecting each higher energy section to LPC 
analysis, yielding the LPC residual error signal, comprises, for each higher energy section, 
determining the LPC residual error signal as the square of the difference between the higher 
energy section and an LPC-derived model of the higher energy section. 
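
By way of illustration only, the squared-difference residual recited in claim 7 might be sketched as follows. The claims do not specify how the LPC model is derived; this sketch assumes the autocorrelation method with the Levinson-Durbin recursion, and the function names and the `order` parameter are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]


def lpc_coefficients(x, order):
    """Predictor coefficients a[0..order-1] so that x[n] ~ sum(a[k] * x[n-1-k]).

    Assumption: autocorrelation method solved by Levinson-Durbin recursion.
    """
    r = autocorrelation(x, order)
    a = [0.0] * (order + 1)
    err = r[0]
    if err == 0:
        return [0.0] * order          # all-zero input: nothing to predict
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
        if err <= 0:
            break                     # signal is already perfectly predicted
    return a[1:]


def lpc_residual_squared(x, order):
    """Square of the difference between x and its LPC-derived prediction."""
    a = lpc_coefficients(x, order)
    residual = []
    for n in range(len(x)):
        predicted = sum(a[k] * x[n - 1 - k] for k in range(order) if n - 1 - k >= 0)
        residual.append((x[n] - predicted) ** 2)
    return residual
```

Under this sketch, samples that the model predicts poorly, such as an abrupt excitation, stand out as peaks in the returned residual.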

8. (original) The method of claim 6, wherein locating the number of largest peaks within the
LPC residual error signal that have the minimum separation between adjacent of the peaks 
comprises, from all the largest peaks within the LPC residual error signal, removing those peaks 
that lack the minimum separation between adjacent of the peaks. 
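
By way of illustration only, the peak selection recited in claims 6 and 8 might be sketched as follows. The claims do not state which of two overly close peaks is removed; this sketch assumes a greedy rule that keeps the larger peak. The names `local_maxima` and `largest_separated_peaks` and the parameters `count` and `min_separation` are hypothetical.

```python
def local_maxima(signal):
    """Indices of interior samples that are higher than their left neighbor
    and at least as high as their right neighbor."""
    return [
        i for i in range(1, len(signal) - 1)
        if signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
    ]


def largest_separated_peaks(signal, count, min_separation):
    """Return up to `count` peak indices, largest first, dropping any peak that
    lies closer than `min_separation` samples to a peak already kept
    (assumed greedy rule)."""
    candidates = sorted(local_maxima(signal), key=lambda i: signal[i], reverse=True)
    kept = []
    for idx in candidates:
        if len(kept) >= count:
            break
        if all(abs(idx - k) >= min_separation for k in kept):
            kept.append(idx)
    return sorted(kept)
```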

9. (cancelled) 

10. (currently amended) A computer-readable medium having a computer program stored 
thereon to perform a glottal event confirmation method comprising: 

for each adjacent pair of glottal events within each of a plurality of speech segments within 
a signal representing digitized, sampled human speech, 

comparing a first glottal event and a second glottal event of the adjacent pair of 
glottal events to determine a pair-wise distance between the first and the second glottal events; 
and, 

adjusting boundaries of at least one of the first glottal event and the second glottal 
event to minimize the pair-wise distance between the first and the second glottal events, such that 
adjusting the boundaries of at least one of the first glottal event and the second glottal event 
results in the pair-wise distance between the first and the second glottal events being minimized,

and such that the glottal event confirmation method increases accuracy of subsequently 
performed speaker verification methods. 

11. (original) The medium of claim 10, wherein adjusting the boundaries of at least one of the
first glottal event and the second glottal event comprises adjusting at least one of a start point and 
an end point of at least one of the first glottal event and the second glottal event. 
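
By way of illustration only, the boundary adjustment recited in claims 10 and 11 might be sketched as follows. The claims do not define the pair-wise distance or the search strategy; this sketch assumes a mean squared difference over the common length of the two events and a small exhaustive shift of the second event's start and end points. The names `pairwise_distance` and `register_pair` and the `max_shift` parameter are hypothetical.

```python
def pairwise_distance(signal, first, second):
    """Mean squared difference between two events over their common length
    (assumed distance measure). Events are (start, end) with end exclusive."""
    (s1, e1), (s2, e2) = first, second
    n = min(e1 - s1, e2 - s2)
    if n <= 0:
        return float("inf")
    return sum((signal[s1 + i] - signal[s2 + i]) ** 2 for i in range(n)) / n


def register_pair(signal, first, second, max_shift):
    """Shift the second event's start and end points by up to `max_shift`
    samples and return (adjusted_event, distance) for the smallest distance."""
    s2, e2 = second
    best = second
    best_dist = pairwise_distance(signal, first, second)
    for shift in range(-max_shift, max_shift + 1):
        cand = (s2 + shift, e2 + shift)
        # Assumption: the second event may not overlap the first or leave the signal.
        if cand[0] < first[1] or cand[1] > len(signal):
            continue
        dist = pairwise_distance(signal, first, cand)
        if dist < best_dist:
            best, best_dist = cand, dist
    return best, best_dist
```

For a perfectly periodic signal, a second event whose boundaries are off by one sample is moved back onto the period boundary, where the distance under this sketch is zero.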

12. (original) The medium of claim 10, wherein adjusting the boundaries of at least one of the
first glottal event and the second glottal event maximizes similarity of the first and the second 
glottal events. 

13. (original) The medium of claim 10, wherein the method further comprises initially locating 
a plurality of glottal events within each speech segment within the signal. 

14. (original) The medium of claim 13, wherein locating the plurality of glottal events within 
each speech segment comprises, for each speech segment: 

subjecting each of a plurality of higher energy sections within the speech segment to a 
linear predictive coefficient (LPC) analysis, yielding a LPC residual error signal for each higher 
energy section; 

locating a number of largest peaks within the LPC residual error signal for each higher 
energy section that have a minimum separation between adjacent of the peaks; 

locating the plurality of glottal events within the speech segment as corresponding to the 
number of largest peaks within the LPC residual error signal that have the minimum separation; 

removing any of the plurality of glottal events within the speech segment that have a zero 
crossing rate greater than a threshold rate; and, 

removing any of the plurality of glottal events within the speech segment that have a 
duration outside of a threshold pitch interval range. 
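
By way of illustration only, the two removal steps recited at the end of claim 14 might be sketched as follows. The sketch assumes that events are (start, end) sample ranges, that the zero crossing rate is the fraction of adjacent sample pairs that change sign, and that the pitch interval range is expressed in samples. The function names and the parameters `max_zcr`, `min_len`, and `max_len` are hypothetical.

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ (assumed definition)."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings / (len(samples) - 1)


def filter_glottal_events(signal, events, max_zcr, min_len, max_len):
    """Keep events whose duration lies in [min_len, max_len] samples and whose
    zero crossing rate does not exceed max_zcr."""
    kept = []
    for start, end in events:
        duration = end - start
        if not (min_len <= duration <= max_len):
            continue  # duration outside the assumed pitch interval range
        if zero_crossing_rate(signal[start:end]) > max_zcr:
            continue  # too many sign changes for this sketch's threshold
        kept.append((start, end))
    return kept
```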

15. (original) The medium of claim 13, wherein the method further comprises, prior to
locating the plurality of glottal events within each speech segment: 

locating the plurality of speech segments within the signal; and, 
locating one or more higher energy sections within each speech segment. 

16. (original) The medium of claim 15, wherein the method further comprises, prior to
locating the plurality of speech segments within the signal, receiving the signal. 

17. (currently amended) A speaker verification system comprising: 

a computer-readable medium having stored thereon a plurality of first glottal events 
extracted from previously recorded human speech; and, 

a recording device to record further human speech and store a signal representing the 
further human speech on the computer-readable medium; and,

a mechanism to generate a plurality of second glottal events from the signal, to confirm
the plurality of second glottal events by registering each second glottal event with adjacent second
glottal events, and to compare the plurality of second glottal events with the plurality of first
glottal events to determine whether the further human speech recorded matches the previously
recorded human speech,

wherein the mechanism is to, for each adjacent pair of the glottal events within each
speech segment, each adjacent pair of the glottal events including a leading glottal event and a
lagging glottal event, adjust boundaries of at least one of the leading glottal event and the lagging
glottal event to minimize a pair-wise distance between the leading and the lagging glottal events,
maximizing similarity of the leading and the lagging glottal events of the adjacent pair, such that
adjusting the boundaries of at least one of the leading glottal event and the lagging glottal event
results in the pair-wise distance between the leading and the lagging glottal events being
minimized.

18. (original) The speaker verification system of claim 17, wherein accuracy of determining 
whether the further human speech recorded matches the previously recorded human speech is 
increased by the mechanism confirming the plurality of second glottal events by registering each 
second glottal event with adjacent second glottal events. 

19. (original) The speaker verification system of claim 17, wherein the mechanism is a
computer program stored on the computer-readable medium. 

20. (currently amended) A speaker verification system comprising: 

first means for recording human speech and for storing a signal representing the human 
speech on a computer-readable medium having previously stored thereon a plurality of first glottal 
events extracted from previously recorded human speech; and,

second means for generating a plurality of second glottal events from the signal, for
confirming the plurality of second glottal events by registering each second glottal event with
adjacent second glottal events, and for comparing the plurality of second glottal events with the
plurality of first glottal events to determine whether the further human speech recorded matches
the previously recorded human speech,

wherein the means is to, for each adjacent pair of the glottal events within each speech
segment, each adjacent pair of the glottal events including a leading glottal event and a lagging
glottal event, adjust boundaries of at least one of the leading glottal event and the lagging glottal
event to minimize a pair-wise distance between the leading and the lagging glottal events,
maximizing similarity of the leading and the lagging glottal events of the adjacent pair, such that
adjusting the boundaries of at least one of the leading glottal event and the lagging glottal event
results in the pair-wise distance between the leading and the lagging glottal events being
minimized.



